{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 287.Find the Duplicate Number 找到重复的数\n",
    "\n",
    "### 难度：Medium\n",
    "\n",
    "## 刷题内容\n",
    "\n",
    "> 原题链接\n",
    "\n",
    " - 中文：https://leetcode-cn.com/problems/find-the-duplicate-number/description/\n",
    " - 英文：https://leetcode.com/problems/find-the-duplicate-number\n",
    " \n",
    "> 内容描述\n",
    "\n",
    "```\n",
    "给定一个包含 n + 1 个整数的数组 nums，其数字都在 1 到 n 之间（包括 1 和 n），可知至少存在一个重复的整数。假设只有一个重复的整数，找出这个重复的数。\n",
    "\n",
    "示例 1:\n",
    "输入: [1,3,4,2,2]\n",
    "输出: 2\n",
    "\n",
    "示例 2:\n",
    "输入: [3,1,3,4,2]\n",
    "输出: 3\n",
    "\n",
    "说明：\n",
    "不能更改原数组（假设数组是只读的）。\n",
    "只能使用额外的 O(1) 的空间。\n",
    "时间复杂度小于 O(n2) 。\n",
    "数组中只有一个重复的数字，但它可能不止重复出现一次。\n",
    "```\n",
    "\n",
    "## 解题方案\n",
    "\n",
    "> 思路 1\n",
    "\n",
    "实际上，我们在阅读完题目的时候，直观感觉，题目不是很难。但是，在我们看到下面的说明，也就是对我们题目的限制条件的时候，会猛然发现，题目的难度瞬间提升了好多倍。\n",
    "\n",
    "下面我列出咱们需要注意的点：\n",
    " - 包含 n+1 个整数的数组\n",
    " - 数组中的数字都在 1 到 n 之间（包括 1 和 n ，但是可以不连续，比如 [1,4,4,3,4]）\n",
    " - 重复的数字，可以重复很多次，并不限于重复 1 次，2次，3次..\n",
    " - 不能更改原数组（这条是最要命的，原本我的想法是，我们可以排序完成之后，再进行二分查找法，碍于这个规定，无法进行）\n",
    " - 只能用额外的 O(1) 的空间。（O(1) 空间的意思是不会因数组的长度改变而改变，对应本题就是不管我们的数组多长，我们能用的额外控制只能是 m 这个m 是一个确定的数，不会随着 n 的改变而改变。）\n",
    " - 时间的复杂度小于 $O(n^2)$\n",
    " \n",
    "注意的点差不多就是上面所说的了。\n",
    "\n",
    "```\n",
    "这个思路我们使用 二分查找（binary search）+ 鸽笼原理（Pigeonhole Principle）\n",
    "参考维基百科关于鸽笼原理的词条链接：https://en.wikipedia.org/wiki/Pigeonhole_principle\n",
    "\n",
    "\"不允许修改数组\" 与 \"常数空间复杂度\" 这两个限制条件意味着：禁止排序，并且不能使用 Map 等数据结构\n",
    "\n",
    "小于 O(n^2) 的运行时间复杂度可以联想到使用二分将其中的一个 n 化简为 log n\n",
    "可以参考 LeetCode Discuss:https://leetcode.com/discuss/60830/python-solution-explanation-without-changing-input-array\n",
    "\n",
    "二分枚举答案范围，使用鸽笼原理进行检验\n",
    "\n",
    "根据鸽笼原理，给定 n+1 个范围为 [1, n]的整数，其中一定存在数字出现至少两次。\n",
    "假设枚举的数字为 n / 2 ：\n",
    "遍历数组，若数组中不大于 n / 2 的数字个数超过 n / 2 ，则可以确定 [1, n/2] 范围内一定有解，否则可以确定解落在 (n/2, n]范围内。\n",
    "```\n",
    "\n",
    "也可以这样分析一下：\n",
    "```\n",
    "\n",
    "如果n 是5，那么就会有1 2 3 4 5 一共5个数字的可能，而array size 是6，那么其中一个数字肯定会至少出现两次。\n",
    "\n",
    "如果没有重复的数字，小于等于1的数字 出现的次数 等于 1；\n",
    "\n",
    "小于等于2的数字 出现的次数 等于 2；\n",
    "\n",
    "... 同理3；4；5。\n",
    "\n",
    "如果有重复的数字，如果重复的是1，那么 小于等于1的数字 出现的次数 肯定大于1；\n",
    "\n",
    "基于这个理论，我们可以在1 2 3 4 5 选出一个 mid， 遍历array来count 小于等于mid 的数字个数 小于等于 它自己mid 还是 大于 mid？\n",
    "\n",
    "如果count 小于等于mid， 说明 1 到 mid 这些数字 没有重复项， 重复项在 右半边 mid 到n， 所以缩小到右半边继续搜索；\n",
    "\n",
    "如果count 大于mid， 说明  1 到 mid 这些数字中 有重复项，缩小到 左半边继续搜索。\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3\n"
     ]
    }
   ],
   "source": [
    "class Solution(object):\n",
    "    def findDuplicate(self, nums):\n",
    "        \"\"\"\n",
    "        :type nums: List[int]\n",
    "        :rtype: int\n",
    "        \"\"\"\n",
    "        low, high = 1, len(nums) - 1\n",
    "        while low <= high:\n",
    "            mid = (low + high) >> 1\n",
    "            cnt = sum(x <= mid for x in nums)\n",
    "            if cnt > mid:\n",
    "                high = mid - 1\n",
    "            else:\n",
    "                low = mid + 1\n",
    "        return low\n",
    "    \n",
    "s = Solution()\n",
    "nums = [1,2,3,3]\n",
    "print(s.findDuplicate(nums))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "思路 1 的解法是时间复杂度为 $O(nlogn)$ 的解法，思路 2 则是时间复杂度为 $O(n)$ ，但是相对来说，是投机取巧了些。\n",
    "> 思路 2\n",
    "\n",
    "一次遍历统计，另一次遍历输出即可。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3\n"
     ]
    }
   ],
   "source": [
    "class Solution(object):\n",
    "    def findDuplicate(self, nums):\n",
    "        \"\"\"\n",
    "        :type nums: List[int]\n",
    "        :rtype: int\n",
    "        \"\"\"\n",
    "        dic = dict()\n",
    "        for n in nums:\n",
    "            dic[n] = dic.get(n, 0) + 1\n",
    "            if dic[n] >= 2:\n",
    "                return n\n",
    "\n",
    "s = Solution()\n",
    "nums = [1,2,3,3]\n",
    "print(s.findDuplicate(nums))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "> 思路 3 \n",
    "\n",
    "思路 3 则是更加完整的 $O(n)$ 的解法，现在我还没有完全搞懂，我先写在下面吧，大佬们可以提前学习了解。"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "3\n"
     ]
    }
   ],
   "source": [
    "class Solution(object):\n",
    "    def findDuplicate(self, nums):\n",
    "        # The \"tortoise and hare\" step.  We start at the end of the array and try\n",
    "        # to find an intersection point in the cycle.\n",
    "        slow = 0\n",
    "        fast = 0\n",
    "    \n",
    "        # Keep advancing 'slow' by one step and 'fast' by two steps until they\n",
    "        # meet inside the loop.\n",
    "        while True:\n",
    "            slow = nums[slow]\n",
    "            fast = nums[nums[fast]]\n",
    "    \n",
    "            if slow == fast:\n",
    "                break\n",
    "    \n",
    "        # Start up another pointer from the end of the array and march it forward\n",
    "        # until it hits the pointer inside the array.\n",
    "        finder = 0\n",
    "        while True:\n",
    "            slow   = nums[slow]\n",
    "            finder = nums[finder]\n",
    "    \n",
    "            # If the two hit, the intersection index is the duplicate element.\n",
    "            if slow == finder:\n",
    "                return slow\n",
    "            \n",
    "s = Solution()\n",
    "nums = [1,2,3,3]\n",
    "print(s.findDuplicate(nums))"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
